A Machine Learning Ensemble Approach to Churn Prediction Developing and Comparing Local Explanation Models on Top of a Black-Box Classifier
نویسنده
چکیده
Churn prediction methods are widely used in Customer Relationship Management and have proven to be valuable for retaining customers. To obtain a high predictive performance, recent studies rely on increasingly complex machine learning methods, such as ensemble or hybrid models. However, the more complex a model is, the more difficult it becomes to understand how decisions are actually made. Previous studies on machine learning interpretability have used a global perspective for understanding black-box models. This study explores the use of local explanation models for explaining the individual predictions of a Random Forest ensemble model. The churn prediction was studied on the users of Tink – a finance app. This thesis aims to take local explanations one step further by making comparisons between churn indicators of different user groups. Three sets of groups were created based on differences in three user features. The importance scores of all globally found churn indicators were then computed for each group with the help of local explanation models. The results showed that the groups did not have any significant differences regarding the globally most important churn indicators. Instead, differences were found for globally less important churn indicators, concerning the type of information that users stored in the app. In addition to comparing churn indicators between user groups, the result of this study was a well-performing Random Forest ensemble model with the ability of explaining the reason behind churn predictions for individual users. The model proved to be significantly better than a number of simpler models, with an average AUC of 0.93.
منابع مشابه
Application of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملImproving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently by developing the technology, the number of network-based servicesis increasing, and sensitive information of users is shared through the Internet.Accordingly, large-scale malicious attacks on computer networks could causesevere disruption to network services so cybersecurity turns to a major concern fornetworks. An intrusion detection system (IDS) could be cons...
متن کاملDevelopment of an Ensemble Multi-stage Machine for Prediction of Breast Cancer Survivability
Prediction of cancer survivability using machine learning techniques has become a popular approach in recent years. In this regard, an important issue is that preparation of some features may need conducting difficult and costly experiments while these features have less significant impacts on the final decision and can be ignored from the feature set. Therefore, developing a machine for p...
متن کاملMachine learning algorithms in air quality modeling
Modern studies in the field of environment science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. The recent advances in statistical modeling based on machine learning approaches have emerged as solution to tackle these issues. It is a fact that, input variable type largely affec...
متن کاملFault Detection of Anti-friction Bearing using Ensemble Machine Learning Methods
Anti-Friction Bearing (AFB) is a very important machine component and its unscheduled failure leads to cause of malfunction in wide range of rotating machinery which results in unexpected downtime and economic loss. In this paper, ensemble machine learning techniques are demonstrated for the detection of different AFB faults. Initially, statistical features were extracted from temporal vibratio...
متن کامل